21 research outputs found

    Low-effort place recognition with WiFi fingerprints using deep learning

    WiFi signals are the main localization modality of existing personal indoor localization systems operating on mobile devices. WiFi fingerprinting is also used for mobile robots, as WiFi signals are usually available indoors and can provide a rough initial position estimate or can be used together with other positioning systems. Currently, the best solutions rely on filtering, manual data analysis, and time-consuming parameter tuning to achieve reliable and accurate localization. In this work, we propose to use deep neural networks (DNNs) to significantly lower the manual effort required to design the localization system, while still achieving satisfactory results. Assuming the state-of-the-art hierarchical approach, we employ the DNN system for building/floor classification. We show that stacked autoencoders make it possible to reduce the feature space efficiently in order to achieve robust and precise classification. The proposed architecture is verified on the publicly available UJIIndoorLoc dataset and the results are compared with other solutions.

    Hardware implementation of a decision tree classifier for object recognition applications

    A hardware implementation of a widely used decision tree classifier is presented in this paper. The task of the classifier is to perform image-based object classification. The performance of the implemented architecture is evaluated in terms of resource utilization and processing speed. The presented architecture is compact, flexible and highly scalable, and compares favorably to software-only solutions in terms of processing speed and power consumption.

    FPGA implementation of a multiprocessor system performing the RANSAC algorithm

    The article describes a software-based, multiprocessor implementation of the RANSAC algorithm, which enables robust estimation of a mathematical model from measurement data containing a significant proportion of outliers. The system was implemented in an FPGA using configurable MicroBlaze soft processors. The paper describes the RANSAC algorithm, the way it is partitioned for parallel processing, and the configuration process of the multiprocessor system. It also presents the gain in processing speed as a function of the number of processor cores used, compares these results with an implementation on a PC-class computer, and reports the FPGA resource usage.
    The paper describes a multiprocessor system implementing the RANSAC algorithm [3], which enables robust estimation of a fundamental matrix from a set of image keypoint correspondences containing some amount of outliers. The fundamental matrix encodes the relationship between two views of the same scene. Knowledge of the fundamental matrix enables, e.g., the reconstruction of the scene structure. The implemented system is based on three MicroBlaze microprocessors [5] (one master, two slaves) and a dedicated hardware coprocessor connected using fast simplex link (FSL) interfaces [6]. The slave microprocessors compute the fundamental matrix from point correspondences using singular value decomposition, i.e. the so-called 8-point algorithm [1, 2] (hypothesis generation). The master processor, along with the connected coprocessor, is responsible for dataflow handling and hypothesis testing using the Sampson error formula (7). The hypothesize-and-test framework used in RANSAC allows for largely independent task execution. The design is a development of a system described in [5]. The block diagram and dataflow diagram of the proposed solution are given in Figs. 1 and 2, respectively. Tables 1 and 2 summarize the use of FPGA resources. With a 100 MHz clock, the designed system is capable of processing the data at a speed roughly equivalent to that of the Atom N270 microprocessor clocked at 1.2 GHz. The resulting solution is targeted at applications for which small size, weight and power consumption are critical. The design is also easily scalable: adding more slave processors further increases the processing speed.
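The hypothesis-testing step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' hardware design: the function names `sampson_error` and `count_inliers`, the nested-list matrix layout, and the inlier threshold are assumptions made for illustration. It uses the standard first-order (Sampson) approximation of the geometric error for a correspondence under a fundamental matrix F.

```python
def sampson_error(F, x1, x2):
    """Sampson (first-order geometric) error of the correspondence x1 <-> x2.

    F is a 3x3 matrix given as nested lists; x1 and x2 are (x, y) points
    in the first and second image (hypothetical data layout).
    """
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    # F * p1 and F^T * p2
    Fp1 = [sum(F[i][j] * p1[j] for j in range(3)) for i in range(3)]
    Ftp2 = [sum(F[i][j] * p2[i] for i in range(3)) for j in range(3)]
    # Algebraic epipolar residual p2^T F p1
    algebraic = sum(p2[i] * Fp1[i] for i in range(3))
    denom = Fp1[0] ** 2 + Fp1[1] ** 2 + Ftp2[0] ** 2 + Ftp2[1] ** 2
    return algebraic ** 2 / denom if denom > 0 else float("inf")


def count_inliers(F, correspondences, threshold):
    """Score one RANSAC hypothesis F by counting correspondences below threshold."""
    return sum(1 for x1, x2 in correspondences
               if sampson_error(F, x1, x2) < threshold)
```

In a hypothesize-and-test loop, each candidate matrix produced by the 8-point step would be scored this way and the candidate with the most inliers kept.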

    Comparison of hardware implementations of two popular corner detectors

    The article presents hardware implementations of two corner detectors, the Harris detector and the FAST detector, in FPGA structures. The processing speed matches that achieved on contemporary personal computers, while the use of inexpensive FPGAs reduces the power consumption as well as the cost and size of the complete system. The article describes both algorithms, gives block diagrams of their hardware implementations, and summarizes and compares the FPGA resources used by the two implementations. A preliminary analysis of the results obtained by applying the implemented detectors to an image sequence is also carried out.
    Many contemporary computer and machine vision applications require finding corresponding points in image sequences. For that purpose, many point feature detectors have been developed. Most of them detect corners, i.e. points that mark object boundaries or boundaries of significant object parts, as features. This paper presents FPGA implementations of two popular corner detectors: the Harris [2] and FAST [3] corner detectors. The proposed solutions enable processing of 512x512 pixel, 8-bit grayscale image data at over 400 frames per second (FAST) and over 350 frames per second (Harris). The processing speeds are the same as or even better than those that can be achieved using modern high-performance PCs. FPGA implementations, however, are less power-hungry, relatively inexpensive and more compact, which is critical in many applications. Our implementations are targeted at applications in mobile robotics. The paper contains a short description of the implemented algorithms, block diagrams of the implemented architectures, and a summary of the FPGA resources required by both implementations. A preliminary analysis of the performance of the implemented algorithms with regard to feature repeatability is also carried out. The results show that the implementation of the FAST algorithm performs better in terms of speed. The FAST algorithm also performs better on image sequences with strong structure (urban scenes, interiors, etc.). The Harris detector implementation, although in general slower and a little more resource-hungry than the FAST implementation (it requires hardware multipliers), demonstrates better performance on poorly structured scene sequences (grass, dirt roads, etc.). These conclusions are consistent with the results of earlier research [3, 4].

    Hardware implementation of background subtraction algorithm

    The paper presents an FPGA implementation of a moving object detection system using the approximate median method. To improve the results, the algorithm was modified by applying an averaging filter and a maximum filter to the difference image. The whole system was realized as a hardware/software architecture based on the MicroBlaze microprocessor together with a dedicated hardware processor connected via the FSL interface.
    The paper presents the FPGA implementation of a moving object detection system based on the approximate median algorithm [1]. The method, despite its simplicity and low memory requirements, offers good detection quality [2]. To further improve the results, the original algorithm was modified by applying additional averaging and maximum filtering to the difference image [3]. The system is implemented as a hybrid hardware/software architecture, based on the MicroBlaze microprocessor [4], along with a dedicated coprocessor connected to it via the FSL (Fast Simplex Link) interface [5]. The microprocessor works under the control of the Xilkernel operating system, along with the LwIP TCP/IP stack, which allows data to be transferred over Ethernet. The software part of the algorithm receives the input image data, computes the difference image, and updates the background model accordingly. The difference image is then filtered by the Gaussian and maximum filters, which are implemented as a single hardware coprocessor. The processed data is sent back to the PC. Table 1 presents the summary of resources used for the implementation. Figure 1 outlines the system architecture. Figures 2 and 3 show the detailed coprocessor structure. The implemented system is capable of processing over ten 256x256, 8-bit grayscale image frames per second using an inexpensive Spartan-3E FPGA with a 50 MHz clock (see Fig. 4).
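The background model update named in the abstract above can be sketched in a few lines of Python. This is only an illustration of the basic approximate median rule (move each background pixel one grey level towards the current frame), not the authors' hardware/software system: the function names, the list-of-lists image layout and the threshold are assumptions, and the additional filtering of the difference image described in the abstract is omitted.

```python
def update_background(background, frame):
    """Approximate median update: step each background pixel by one grey
    level towards the corresponding pixel of the current frame."""
    return [
        [b + 1 if p > b else b - 1 if p < b else b
         for b, p in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold):
    """Mark pixels whose absolute difference from the background exceeds
    the threshold (threshold value is a hypothetical tuning parameter)."""
    return [
        [abs(p - b) > threshold for b, p in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]
```

Because each pixel only needs a comparison and an increment or decrement, the model needs one stored value per pixel, which is in line with the low memory requirement noted in the abstract.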

    The performance comparison of the DMA subsystem of the Zynq SoC in bare metal and Linux applications

    The paper compares the direct memory access (DMA) performance of a Zynq SoC based system working in a bare metal configuration and running the Linux operating system (OS). The overhead introduced by the driver and software components of the Linux OS is evaluated and analyzed. The evaluation is performed on a real-life video processing scenario involving transfers of significant amounts of data to and from memory.

    A flexible, high performance hardware implementation of the simplified histogram of oriented gradients descriptor

    In this paper, a high performance, configurable, compact hardware architecture for computing histogram of oriented gradients (HoG) descriptors is presented. The descriptor computation algorithm is simplified with respect to the original solution, reducing the hardware resource cost with only a small accuracy penalty. The proposed architecture can be adapted to different block sizes and different block grid configurations, enabling its use in a wide range of object detection and recognition tasks with varying region of interest sizes. The resulting architecture is systolic and massively parallel, enabling high throughput processing.

    Comparative assessment of point feature detectors in the context of robot navigation

    This paper presents an evaluation of various contemporary interest point detector and descriptor pairs in the context of robot navigation. The robustness of the detectors and descriptors is assessed using publicly available datasets: the first gathered from a camera mounted on an industrial robot [17] and the second gathered from a mobile robot [20]. The most efficient detectors and descriptors for visual robot navigation are selected.

    On augmenting the visual SLAM with direct orientation measurement using the 5-point algorithm

    This paper presents an attempt to merge two paradigms of visual robot navigation: Visual Simultaneous Localization and Mapping (VSLAM) and Visual Odometry (VO). The VSLAM was augmented with a direct, visual measurement of the robot orientation change using the 5-point algorithm. An extended movement model of the robot was proposed and additional measurements were introduced to the SLAM system. The efficiency of the 5-point and 8-point algorithms was compared. The augmented system was compared with a state-of-the-art VSLAM solution, and the proposed modification reduced the tracking error by over 30%.

    A hardware system for muscle force and tiredness estimation from electromyographic signal

    The paper presents an implementation of a unit for estimating muscle force and fatigue from the electromyographic (EMG) signal, recorded with a two-channel amplifier, and the joint position measured with a quadrature encoder. Structures computing the current mean frequency (MNF) and root mean square (RMS) value of the signal, as well as the angle, were implemented in the FPGA, which makes it possible to estimate the current force and fatigue. The developed solution is scalable and supports parallel handling of a number of channels limited only by the FPGA resources.
    This paper presents an FPGA implementation of a muscle force and fatigue estimation unit based on the analysis of an electromyography (EMG) signal measured with a two-channel amplifier and the joint position measured with a quadrature encoder. Contemporary systems use the root mean square (RMS) of the EMG signal and the muscle length to estimate the contraction force, and the decrease in the median frequency of the EMG signal to detect muscle fatigue [2]. The proposed system consists of (Fig. 1): an infinite impulse response (IIR) high-pass filter with a cut-off frequency of 10 Hz, a dedicated RMS calculation block for a 512-sample window (Fig. 2), a Fast Fourier Transform (FFT) block and a MicroBlaze processor. The muscle length is estimated using measurements from the encoder placed on the joint. The mean value of the EMG signal frequencies is used as an approximation of the median frequency. The system was tested using the Xilinx SP605 evaluation kit and the obtained results were verified. The resource usage is presented in Table 1. Due to the inherent ability of the FPGA to parallelize computation, additional measurement channels can easily be added without increasing the processing time. The presented system is portable and can be used as a part of any mobile solution requiring feedback on the state of the muscles (e.g. an exoskeleton). Due to its scalability, it can easily be extended into a larger muscle-analysis system. Moreover, it can be modified to facilitate analysis of other biomedical signals.
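The two signal features named in the abstract above, the RMS value and the mean frequency (MNF) of a sample window, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' FPGA design: the function names and the sampling rate used in the usage note are hypothetical, the MNF is taken as the power-weighted mean of the spectrum frequencies, and a direct DFT is used in place of an FFT block for brevity.

```python
import cmath
import math


def rms(samples):
    """Root mean square of one window of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mean_frequency(samples, fs):
    """Power-weighted mean frequency (MNF) of one window.

    fs is the sampling rate in Hz (an assumed parameter). The DC bin is
    skipped; a direct DFT is evaluated for bins 1 .. n/2.
    """
    n = len(samples)
    weighted = total = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        power = abs(coeff) ** 2
        weighted += (k * fs / n) * power
        total += power
    return weighted / total if total > 0 else 0.0
```

For example, a 512-sample window of a pure 100 Hz sine sampled at an assumed 1024 Hz should give an MNF close to 100 Hz and an RMS close to the amplitude divided by the square root of two.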